Abstract: A secret key can be used to conceal information 
from an eavesdropper during communication, as in Shannon's 
cipher system. Most theoretical guarantees of secrecy require 
the secret key space to grow exponentially with the length 
of communication. Here we show that when an eavesdropper 
attempts to reconstruct an information sequence, as posed in the 
literature by Yamamoto, very little secret key is required to effect 
unconditionally maximal distortion; specifically, we only need the 
secret key space to increase unboundedly, growing arbitrarily 
slowly with the blocklength. As a corollary, even with a secret key 
of constant size we can still cause the adversary arbitrarily close 
to maximal distortion, regardless of the length of the information 
sequence. 

I. Introduction 

In this work, we consider the Shannon cipher system, first 
investigated in [1]. The cipher system is a communication 
system with the addition of secret key that the legitimate 
parties share and use to encrypt messages. A classic result by 
Shannon in [1] states that to achieve perfect secrecy, the size 
of the secret key space must be at least the size of the message 
space. As in [1], we consider the secrecy resource to be shared 
secret key, but we relax the requirement of perfect secrecy 
and instead look at the minimum distortion that an adversary 
attains when attempting to reproduce the source sequence. 
The joint goal of Alice (transmitter) and Bob (receiver) is 
to communicate a source sequence almost losslessly while 
maximizing the adversary's minimum attainable distortion. In 
contrast to equivocation, a max-min distortion measure pro- 
vides guarantees about the way in which any adversary could 
use his knowledge; equivocation does not give much insight 
into the structure of the knowledge or how the knowledge can 
be used. 

This measure of security was investigated by Yamamoto 
in the general case where distortion is allowed at the legitimate 
receiver. In [2], Yamamoto established upper and lower 
bounds on the tradeoff between the rate of secret key and the 
adversary's distortion. 

In this paper, we solve the problem studied in [2], in the 
case that almost lossless communication is required. We show 
that any positive rate of secret key suffices for Alice and Bob 
to cause the adversary unconditionally maximal distortion (i.e., 
the distortion incurred by only knowing the source distribution 
and nothing else). A positive rate of secret key $R_0$ means 
the number of secret keys is exponential in the blocklength 
$n$, because there are $2^{nR_0}$ secret keys available. However, if 
the secret key space is merely growing unboundedly with n, 
we show that the adversary still suffers maximal distortion. 




Fig. 1. Alice and Bob share secret key $K$, which Alice uses along with 
her observation of $X^n$ to encode a message $M$. Secrecy is measured by the 
minimum distortion Eve can attain. 



We also show that a constant amount of secret key can yield 
nontrivial distortion at the adversary. 

II. Problem Statement 

The system under consideration, shown in Figure 1, operates 
on blocks of length $n$. Alice is given an i.i.d. source sequence 
$X^n = (X_1, \ldots, X_n)$ consisting of symbols drawn from a finite 
alphabet $\mathcal{X}$ according to $P_X$. Alice and Bob share secret key 
in the form of a uniform random variable $K$ taking values 
in an alphabet $\mathcal{K}$. Eve knows the source distribution and the 
operations of Alice and Bob, but does not have access to the 
secret key. At the transmitter, Alice sends $M \in \mathcal{M}$ based 
on the source sequence $X^n$ and secret key $K$; at the other 
end, Bob observes $M$ and $K$ and produces a sequence $\hat{X}^n$. 
Eve produces a sequence $Z^n$ from $M$ and her knowledge of 
the source distribution and system operations (encoder and 
decoder). 

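Definition 1. Let $k : \mathbb{N} \to \mathbb{N}$. An $(n, k(n), R)$ code consists 
of an encoder $f$ and a decoder $g$: 

$$f : \mathcal{X}^n \times \mathcal{K} \to \mathcal{M}$$
$$g : \mathcal{M} \times \mathcal{K} \to \mathcal{X}^n,$$

where the size of the message set is $|\mathcal{M}| = 2^{nR}$, and the 
number of secret keys available is $|\mathcal{K}| = k(n)$. 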

We measure secrecy by the distortion between the source 
sequence and an adversary's estimate of the source sequence. 
Given a per-letter distortion measure $d : \mathcal{X} \times \mathcal{Z} \to [0, \infty)$, we 
define the distortion between two sequences as the average of 
the per-letter distortions: 

$$d^n(x^n, z^n) = \frac{1}{n} \sum_{i=1}^{n} d(x_i, z_i).$$

Without loss of generality, we assume that for all $x \in \mathcal{X}$, there 
exists a $z \in \mathcal{Z}$ such that $d(x, z) = 0$. 
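
As a concrete illustration of the block distortion just defined, the following minimal Python sketch computes $d^n$ under the Hamming measure. The function names and the example sequences are illustrative assumptions and do not come from the original development.

```python
from typing import Callable, Sequence


def hamming(x: int, z: int) -> float:
    """Per-letter Hamming distortion: 0 if the symbols agree, 1 otherwise."""
    return 0.0 if x == z else 1.0


def block_distortion(xs: Sequence[int], zs: Sequence[int],
                     d: Callable[[int, int], float] = hamming) -> float:
    """Average per-letter distortion d^n(x^n, z^n) = (1/n) * sum_i d(x_i, z_i)."""
    if len(xs) != len(zs) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(d(x, z) for x, z in zip(xs, zs)) / len(xs)


# Hypothetical source block and an all-zero estimate.
example = block_distortion([0, 1, 2, 0], [0, 0, 0, 0])
```

Here two of the four positions disagree, so the estimate incurs distortion one half.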

For a given amount of secret key, we are interested in 
the rate of communication and the distortion incurred by the 
cleverest adversary. 

Definition 2. For a given sequence $k(n)$ and measure of 
distortion $d(x, z)$, we say that the pair $(R, D)$ is achievable 
if there exists a sequence of $(n, k(n), R)$ codes such that 

$$\lim_{n \to \infty} P[X^n \neq \hat{X}^n] = 0 \qquad (1)$$

and 

$$\liminf_{n \to \infty} \min_{z^n(m)} E[d^n(X^n, z^n(M))] \ge D. \qquad (2)$$

The requirement in (1) is that the probability of communication 
error between Alice and Bob vanishes. In (2), the 
minimum is taken over all functions $z^n : \mathcal{M} \to \mathcal{Z}^n$, i.e., all 
possible strategies that Eve can employ. Although not explicit 
in the notation, it should be understood that Eve's strategy is 
a function of not only the message $M$, but also the source 
distribution $P_X$ and the $(n, k(n), R)$ code. 

III. Main Result 

The main result is the following theorem. The restriction on 
$R$, the communication rate, is the same as the classic result for 
source coding. Notice that $\min_z E[d(X, z)]$ is the expected distortion 
between $X^n$ and the constant sequence $(z^*, \ldots, z^*)$, where 
$z^* = \arg\min_z E[d(X, z)]$. 

Theorem 1. Let $k(n)$ be an increasing, unbounded sequence. 
Then $(R, D)$ is achievable if and only if$^1$ 

$$R > H(X)$$
$$D \le \min_z E[d(X, z)].$$
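
The quantity $\min_z E[d(X, z)]$ appearing in the theorem is easy to evaluate for a small alphabet. The sketch below is only an illustration: the distribution `p_x` and the helper name `max_distortion` are hypothetical choices, and Hamming distortion is assumed.

```python
def max_distortion(p_x, d, z_alphabet):
    """Return (min_z E[d(X, z)], minimizing z) for a finite source distribution."""
    best_z, best_val = None, float("inf")
    for z in z_alphabet:
        # Expected distortion if Eve always outputs the constant symbol z.
        val = sum(p * d(x, z) for x, p in p_x.items())
        if val < best_val:
            best_z, best_val = z, val
    return best_val, best_z


# Hypothetical source distribution on {0, 1, 2}.
p_x = {0: 0.5, 1: 0.25, 2: 0.25}
d_max, z_star = max_distortion(p_x, lambda x, z: float(x != z), [0, 1, 2])
```

Under Hamming distortion the minimizer is the most likely symbol, so for this example Eve's blind guess is the all-zero sequence and her distortion is one half.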

If we wanted to consider rates of secret key, we would set 
$k(n) = 2^{nR_0}$ and define $(R, R_0, D)$ to be achievable if there 
exists a sequence of codes such that (1) and (2) hold. Then, 
by Theorem 1, we would have that $(R, R_0, D)$ is achievable 
if and only if 

$$\begin{cases} R > H(X) \\ R_0 > 0 \\ D \le \min_z E[d(X, z)] \end{cases} \quad \text{or} \quad \begin{cases} R > H(X) \\ R_0 = 0 \\ D = 0. \end{cases}$$

This is the solution to the lossless case of the problem posed in 
[2]. It should be noted that with the proper choice of auxiliary 
random variables, the converse bound on $R_0$ in [2] is actually 
the trivial bound, $R_0 \ge 0$. 

With Theorem 1 in hand, we are able to say something about 
the usefulness of a finite amount of secret key. The following 
corollary asserts that the cleverest adversary suffers close to 
maximal distortion even if the number of secret keys stays 
constant as blocklength increases. 

$^1$ For simplicity, we ignore the case $R = H(X)$. 



Corollary 1. Fix $P_X$ and $d(x, z)$, and denote $D_{\max} = 
\min_z E[d(X, z)]$. For all $D < D_{\max}$ and $R > H(X)$, there 
exists $k^* \in \mathbb{N}$ such that $(R, D)$ is achievable under $k(n) = k^*$. 

Proof of Corollary 1: Suppose the contrary. That is, 
assume there exist $D < D_{\max}$ and $R > H(X)$ such that 
for all $k \in \mathbb{N}$, $(R, D)$ is not achievable under $k(n) = k$. If 
we denote the minimum attainable distortion for blocklength 
$n$ and $k$ secret keys by 

$$d_{n,k} = \min_{z^n(m)} E[d^n(X^n, z^n(M))],$$

we are asserting that for all $(n, k, R)$ codes, either 

$$\limsup_{n \to \infty} P[X^n \neq \hat{X}^n] > 0 \qquad (3)$$

or 

$$\liminf_{n \to \infty} d_{n,k} < D. \qquad (4)$$

In particular, all $(n, k, R)$ codes not satisfying (3) must satisfy 
(4), which implies that for all $k \in \mathbb{N}$, the sequence $d_{n,k}$ is 
strictly less than $D$ infinitely often. To arrive at a contradiction, 
we will define an increasing unbounded sequence $k(n)$ such 
that $d_{n,k(n)}$ is strictly less than $D$ infinitely often. Since 
$D < D_{\max}$, such a $k(n)$ will imply 

$$\liminf_{n \to \infty} d_{n,k(n)} < D_{\max},$$

contradicting Theorem 1 and completing the proof. To that 
end, first define the increasing sequence $\{N_\ell\}$ recursively by 

$$N_\ell = \min\{n > N_{\ell-1} : d_{n,\ell} < D\}.$$

Then we define $k(n)$ by 

$$k(n) = \ell \quad \text{if } N_{\ell-1} < n \le N_\ell.$$

■ 
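
The recursive construction of $k(n)$ in the proof can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the predicate `below_d(n, l)` stands in for the event $d_{n,\ell} < D$ (the true $d_{n,\ell}$ is not computed), the starting convention $N_0 = 0$ is not stated in the text, and the toy predicate in the example is arbitrary.

```python
def build_key_schedule(below_d, n_max):
    """Build k(n) for n = 1..n_max from a predicate below_d(n, l) ~ "d_{n,l} < D".

    N_l is the first n > N_{l-1} with below_d(n, l) true, and k(n) = l for
    N_{l-1} < n <= N_l. Blocklengths beyond the last N_l found are left out.
    """
    schedule = {}
    prev_n, level = 0, 1  # assumes N_0 = 0
    while True:
        n = prev_n + 1
        while n <= n_max and not below_d(n, level):
            n += 1
        if n > n_max:  # N_level lies beyond the horizon we are willing to search
            break
        for m in range(prev_n + 1, n + 1):
            schedule[m] = level
        prev_n, level = n, level + 1
    return schedule


# Toy predicate: pretend d_{n,l} < D exactly when n is a multiple of l*l + 1.
schedule = build_key_schedule(lambda n, l: n % (l * l + 1) == 0, 12)
```

With this toy predicate the thresholds are 2, 5 and 10, so the schedule is non-decreasing and steps up by one each time a new threshold is passed, which is all the proof requires of $k(n)$.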

The proof of Theorem 1 is presented in the next section, 
but first we provide some intuition for why the result holds 
by briefly addressing some of the proof ideas. In designing 
a code, Alice and Bob can use the secret key $K$ to apply 
a one-time pad to part of the message so that the adversary 
effectively knows that the source sequence $X^n$ lies in a subset 
$B \subset \mathcal{X}^n$, but is unsure which sequence the true one is. The 
number of sequences in $B$ is $|\mathcal{K}| = k(n)$, the number of secret 
keys. Under such a scheme, the adversary's optimal strategy 
for minimizing distortion is to output the following symbol on 
the $i$th step: 

$$z_i(B) = \arg\min_z \sum_{x^n \in B} \frac{p(x^n)}{P[X^n \in B]} \, d(x_i, z). \qquad (5)$$

Note that the sum in (5) is the expected value of $d(X_i, z)$ conditioned on 
the event $\{X^n \in B\}$. Now, if each of the sequences in $B$ were 
equally likely to be the source sequence, (5) would become 

$$z_i(B) = \arg\min_z \sum_{x^n \in B} \frac{1}{|B|} \, d(x_i, z) = \arg\min_z \sum_{x \in \mathcal{X}} Q_i(x) \, d(x, z),$$

Fig. 2. Consider a distribution $P_X$ on $\mathcal{X} = \{0, 1, 2\}$ in which 0 is the most likely symbol, $n = 4$, and $k(4) = 8$. 
Suppose Eve knows that the source sequence $X^4$ is a column of $B$, but 
does not know which column. Since all the columns are equally likely and 
the empirical distribution of each row matches $P_X$, Eve's best strategy is to 
output $\arg\min_z E[d(X, z)]$ at each step (see (6)). For example, if the distortion 
measure were Hamming distance (i.e., $d(x, z) = \mathbf{1}\{x \neq z\}$), then Eve would 
output $(0, 0, 0, 0)$. 



where $Q_i(x)$ denotes the empirical distribution of the $i$th 
symbols of the sequences in $B$, i.e., 

$$Q_i(x) = \frac{1}{|B|} \sum_{x^n \in B} \mathbf{1}\{x_i = x\}.$$



If we could also guarantee that $Q_i(x) = P_X(x)$ for all $x \in \mathcal{X}$, 
then (5) would become 

$$z_i(B) = \arg\min_z E[d(X, z)]. \qquad (6)$$



In the light of this discussion, we want to design a codebook 
and an encryption scheme so that, roughly speaking, 

$$\frac{p(x^n)}{P[X^n \in B]} \approx \frac{1}{|B|} \qquad (7)$$

and 

$$Q_i \approx P_X, \quad i = 1, \ldots, n. \qquad (8)$$



Figure 2 gives an example of (7) and (8). These ideas are 
borne out in the proof of Theorem 1, to which we now turn. 
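
The intuition above can be checked numerically on a small bin. In the sketch below, the eight columns are a hypothetical bin chosen so that every row contains four 0s, two 1s and two 2s; the function name `eve_best_guess`, the bin itself and the use of Hamming distortion are assumptions made for illustration only.

```python
from collections import Counter


def eve_best_guess(bin_columns, d, z_alphabet):
    """Eve's per-position minimizer when all columns of the bin are equally likely.

    Returns (guess sequence, resulting average distortion).
    """
    n = len(bin_columns[0])
    size = len(bin_columns)
    guesses, total = [], 0.0
    for i in range(n):
        q_i = Counter(col[i] for col in bin_columns)  # symbol counts in row i
        costs = {z: sum(c * d(x, z) for x, c in q_i.items()) / size
                 for z in z_alphabet}
        z_best = min(z_alphabet, key=lambda z: costs[z])
        guesses.append(z_best)
        total += costs[z_best]
    return tuple(guesses), total / n


# Hypothetical bin: row i is the base row rotated left by i positions.
base = [0, 0, 0, 0, 1, 1, 2, 2]
rows = [base[i:] + base[:i] for i in range(4)]
columns = list(zip(*rows))  # eight distinct length-4 sequences

guess, distortion = eve_best_guess(columns, lambda x, z: float(x != z), [0, 1, 2])
```

Because each row has the same empirical distribution, Eve's best guess is the same symbol at every position and her distortion equals what she would obtain knowing only that distribution.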

IV. Proof of Theorem 1 

In preparation for the proof of achievability, we first define 
$\epsilon$-typicality for a distribution $P$ with finite support $\mathcal{X}$: 

$$T_\epsilon^n(P) = \{x^n \in \mathcal{X}^n : |Q_{x^n}(x) - P(x)| < \epsilon, \ \forall x \in \mathcal{X}\},$$

where $Q_{x^n}(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{x_i = x\}$ is the empirical distribution, 
or "type", of $x^n$. Denote the set of types of sequences 
$x^n \in \mathcal{X}^n$ by $\mathcal{P}^n$, and let $\mathcal{P}_\epsilon^n \subset \mathcal{P}^n$ denote the set of types 
of those sequences $x^n \in \mathcal{X}^n$ satisfying $x^n \in T_\epsilon^n(P_X)$. For 
$P \in \mathcal{P}^n$, use $|P|$ to denote the number of sequences of type 
$P$. Finally, define the variational distance between distributions 
$P$ and $Q$ by 

$$\|P - Q\|_{TV} = \sup_A |P(A) - Q(A)|.$$
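
To make the typicality definition concrete, here is a minimal Python sketch that computes the type of a sequence and tests membership in the typical set. The helper names and the example distribution are illustrative assumptions.

```python
from collections import Counter


def empirical_type(xs, alphabet):
    """Empirical distribution (type) of the sequence xs over the given alphabet."""
    counts = Counter(xs)
    return {a: counts[a] / len(xs) for a in alphabet}


def is_typical(xs, p, eps):
    """True if |Q_{x^n}(x) - P(x)| < eps for every symbol x in the support of p."""
    q = empirical_type(xs, p.keys())
    return all(abs(q[a] - p[a]) < eps for a in p)


# Hypothetical distribution on {0, 1, 2}.
p = {0: 0.5, 1: 0.25, 2: 0.25}
typical_example = is_typical([0, 0, 1, 2], p, 0.1)
atypical_example = is_typical([0, 0, 0, 0], p, 0.1)
```

The first sequence has exactly the type `p` and is therefore typical for any positive tolerance, while the all-zero sequence deviates by one half in its first coordinate.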

We will need a few lemmas. The first three lemmas will aid 
us in asserting (8). 

Lemma 1. Let $P \in \mathcal{P}^n$. Form a matrix whose columns are 
the sequences with type P, with the columns arranged in any 
order. Then each of the rows of the matrix also has type P. 

Proof: Any permutation applied to the rows of the matrix 
simply permutes the columns. Therefore all the rows have 
identical type. Since the matrix as a whole has type P, each 
of the rows must be of type P as well. ■ 
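
Lemma 1 can be verified by brute force for a small type class. The sketch below enumerates every sequence of a hypothetical type with $n = 4$ and checks the empirical distribution of each row; the helper name `type_class` and the chosen counts are assumptions for illustration.

```python
from collections import Counter
from itertools import permutations


def type_class(counts):
    """All distinct sequences with the given symbol counts, in sorted order."""
    symbols = [s for s, c in sorted(counts.items()) for _ in range(c)]
    return sorted(set(permutations(symbols)))


counts = {0: 2, 1: 1, 2: 1}      # hypothetical type with n = 4
columns = type_class(counts)     # the matrix columns
n, size = 4, len(columns)

# Row i collects the i-th symbol of every column.
rows = [[col[i] for col in columns] for i in range(n)]
row_types = [{s: Counter(r)[s] / size for s in counts} for r in rows]
target = {s: c / n for s, c in counts.items()}
```

Every entry of `row_types` coincides with `target`, as the lemma asserts.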



Lemma 2 (see [3]). Suppose an urn $U$ contains $n$ balls, each 
marked by an element of the set $S$, whose cardinality $c$ is 
finite. Let $H$ be the distribution of $k$ draws made at random 
without replacement from $U$, and $M$ be the distribution of $k$ 
draws made at random with replacement. Thus, $H$ and $M$ are 
two distributions on $S^k$. Then 

$$\|H - M\|_{TV} \le \frac{ck}{n}.$$
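
For a small urn, the distance in Lemma 2 can be computed exactly and compared with a bound of the form $ck/n$, which is the form in which the lemma is applied in (15). The urn contents below are a hypothetical example, and the function name `tv_with_vs_without` is an assumption of this sketch.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def tv_with_vs_without(urn, k):
    """Exact sup_A |H(A) - M(A)| between k ordered draws without and with replacement."""
    n = len(urn)
    with_repl = Counter()
    for idx in product(range(n), repeat=k):
        with_repl[tuple(urn[i] for i in idx)] += Fraction(1, n ** k)
    without = Counter()
    orderings = list(permutations(range(n), k))
    for idx in orderings:
        without[tuple(urn[i] for i in idx)] += Fraction(1, len(orderings))
    support = set(with_repl) | set(without)
    # For discrete distributions the supremum over events equals half the L1 distance.
    return sum(abs(with_repl[s] - without[s]) for s in support) / 2


urn = [0, 0, 0, 0, 1, 1, 2, 2]   # n = 8 balls marked with c = 3 symbols
k = 2
c = len(set(urn))
tv = tv_with_vs_without(urn, k)
bound = Fraction(c * k, len(urn))
```

In this example the exact distance is 5/56, comfortably below the bound of 3/4; exhaustive enumeration is only feasible for very small $n$ and $k$.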

Thus, sampling without replacement is close in variational 
distance to sampling with replacement (i.e., i.i.d.) provided the 
sample size is small enough and the number of balls is large 
enough. The rate at which the distance vanishes is important 
to our problem. The next lemma is a lower bound on the size 
of a type class. 

Lemma 3 (see [4]). For $P \in \mathcal{P}^n$, 

$$|P| \ge (n+1)^{-|\mathcal{X}|} \, |\mathcal{X}|^{nH(P)}.$$
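
A quick numeric check of the type-class bound is sketched below. It assumes that the entropy in the exponent is measured in base $|\mathcal{X}|$, so that $|\mathcal{X}|^{nH(P)}$ equals $2^{nH_2(P)}$ with $H_2$ in bits; the symbol counts are a hypothetical example and must list every symbol of the alphabet.

```python
import math


def type_class_size(counts):
    """Number of sequences with the given symbol counts (a multinomial coefficient)."""
    size = math.factorial(sum(counts))
    for c in counts:
        size //= math.factorial(c)
    return size


def lower_bound(counts):
    """(n + 1)^(-|X|) * 2^(n * H(P)), with H(P) in bits and |X| = len(counts)."""
    n = sum(counts)
    entropy_bits = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return (n + 1) ** (-len(counts)) * 2 ** (n * entropy_bits)


counts = (4, 2, 2)               # hypothetical type with n = 8 over a ternary alphabet
size = type_class_size(counts)
bound = lower_bound(counts)
```

For this type the class contains 420 sequences while the bound evaluates to roughly 5.6, so the inequality holds with plenty of room at small blocklengths.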

The final lemma concerns sufficient statistics in the context 
of our measure of secrecy. 

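Lemma 4. Let $X$, $Y$, and $Z$ be random variables that form a 
Markov chain $X - Y - Z$ and let $g$ be a function on $\mathcal{A} \times \mathcal{Z}$. 
Define two sets of functions, $F = \{f : \mathcal{X} \times \mathcal{Y} \to \mathcal{A}\}$ and 
$F' = \{f : \mathcal{Y} \to \mathcal{A}\}$. Then 

$$\min_{f \in F} E[g(f(X, Y), Z)] = \min_{f \in F'} E[g(f(Y), Z)].$$

Proof of Lemma 4: ($\le$) follows from $F' \subset F$. As for 
($\ge$), let $f^* \in F$ attain the minimum on the left-hand side and define 
$h(x, y) = \sum_z p(z|y) \, g(f^*(x, y), z)$. Using the Markov chain, we have 

$$\min_{f \in F} E[g(f(X, Y), Z)] = \sum_{x,y} p(x, y) \sum_z p(z|y) \, g(f^*(x, y), z)$$
$$= \sum_{x,y} p(x, y) \, h(x, y)$$
$$= \sum_y p(y) \, E[h(X, Y) \mid Y = y].$$

There exists $x^*(y)$ such that 

$$h(x^*(y), y) \le E[h(X, Y) \mid Y = y],$$

so we define $f' \in F'$ by $f'(y) = f^*(x^*(y), y)$. Then 

$$\sum_y p(y) \, E[h(X, Y) \mid Y = y] \ge \sum_y p(y) \, h(x^*(y), y)$$
$$= \sum_y p(y) \sum_z p(z|y) \, g(f'(y), z)$$
$$= E[g(f'(Y), Z)]$$
$$\ge \min_{f \in F'} E[g(f(Y), Z)].$$

■ 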

Now we begin the proof of Theorem 1. 

Proof of Theorem 1: The proof of the converse is 
straightforward: the converse for lossless source coding gives 
us $R > H(X)$, and Eve can always produce the constant 
sequence $(z^*, \ldots, z^*)$ so that her distortion never exceeds 
$\min_z E[d(X, z)]$. 



To begin the proof of achievability, fix $P_X$, $d(x, z)$, and 
an increasing, unbounded sequence $k(n)$. Let $\epsilon > 0$ and 
$R > H(X)$. We will show that there exists a codebook 
of $2^{nR}$ sequences and an encryption scheme such that 

$$P[X^n \neq \hat{X}^n] < \epsilon \qquad (9)$$

and 

$$\min_{z^n(m)} E[d^n(X^n, z^n(M))] \ge \min_z E[d(X, z)] - \delta(\epsilon) \qquad (10)$$

for sufficiently large $n$, where $\delta(\epsilon) \to 0$ as $\epsilon \to 0$. 

Our codebook, the set of sequences that Alice encodes 
uniquely, consists of the $\epsilon$-typical sequences; thus, (9) is 
satisfied by the law of large numbers. For blocklength $n$, we 
want to consider a partition of the set of typical sequences into 
equally sized subsets (or "bins") of size $k(n)$. A partition 
will let us encode the message in two parts: in the first part, 
we will reveal the identity of the bin that contains the source 
sequence, and in the second part we will encrypt the location 
within the bin by using the secret key to apply a one-time pad. 
Effectively, the second part of the message will be useless to 
Eve. We will denote the set of bins by $\mathcal{B}$, so that each element 
of $\mathcal{B}$ is a bin of $k(n)$ sequences. 

For a given partition of the typical sequences, the encoder 
operates as follows. If $X^n$ is typical and is the $L$th sequence 
in bin $J$, then transmit the pair $(J, L \oplus K)$, where $K$ is the 
secret key and $\oplus$ is addition modulo $k(n)$. If $X^n$ is not typical, 
transmit a random message. 
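
The encoder just described, together with the matching decoder, can be sketched as follows. The toy bins, the lookup table `index_of` and the function names are hypothetical; the sketch assumes bins of exactly $k$ sequences and a key drawn uniformly from $\{0, \ldots, k-1\}$.

```python
import random


def encode(x, index_of, num_bins, key, k, rng):
    """Return (J, (L + K) mod k); send a random message if x is outside the codebook."""
    if x not in index_of:
        return rng.randrange(num_bins), rng.randrange(k)
    j, l = index_of[x]
    return j, (l + key) % k


def decode(message, bins, key, k):
    """Undo the one-time pad on the position and look the sequence up in bin J."""
    j, masked = message
    return bins[j][(masked - key) % k]


# Toy partition: two bins of k = 3 binary sequences, each bin of a single type.
bins = [[(0, 0, 1), (0, 1, 0), (1, 0, 0)],
        [(0, 1, 1), (1, 0, 1), (1, 1, 0)]]
k = 3
index_of = {x: (j, l) for j, b in enumerate(bins) for l, x in enumerate(b)}
rng = random.Random(0)

roundtrip_ok = all(
    decode(encode(x, index_of, len(bins), key, k, rng), bins, key, k) == x
    for x in index_of for key in range(k)
)
masked_positions = {encode((0, 1, 0), index_of, len(bins), key, k, rng)[1]
                    for key in range(k)}
```

For a fixed source sequence, the masked position takes every value in $\{0, \ldots, k-1\}$ as the key varies, which is why the second message component carries no information for Eve when the key is uniform.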

In addition to requiring equal-sized bins, we further restrict 
our attention to partitions in which each bin only contains 
sequences of the same type$^2$ and denote the set of bins of type 
$P$ by $\mathcal{B}_P$; thus, $\mathcal{B} = \bigcup_{P \in \mathcal{P}_\epsilon^n} \mathcal{B}_P$. This restriction addresses (7). 

We claim that there exists a partition so that (10) is satisfied. 
To do this, we first select a partition uniformly at random and 
average the minimum attainable distortion over all partitions. 
We use $E_\pi$ to indicate that expectation is being taken with 
respect to a random partition. If (10) holds for the average, 
then it must hold for at least one partition. This use of the 
probabilistic method should be distinguished from "random 
binning" that is often used in information theory. In random 
binning, each sequence is assigned to a random bin; in 
particular, the bin sizes are random, whereas here they are 
of size k(n). 

Selecting a partition at random is the same as drawing 
typical sequences without replacement to fill equal-sized bins 
of uniform type. This is also equivalent to first fixing a 
partition $\mathcal{B}$ that meets the criteria, then for each $P \in \mathcal{P}_\epsilon^n$ 
randomly permuting the sequences in $\mathcal{B}_P$, selecting the $|\mathcal{P}_\epsilon^n|$ 
random permutations independently. 
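
One way to realize such a random partition is to shuffle each type class independently and cut it into consecutive blocks of $k$ sequences. The sketch below does this for an arbitrary set of sequences; representing a type by the sorted tuple of symbols, and returning the sequences that do not fill a complete bin as a separate list, are simplifying assumptions of this illustration.

```python
import random
from collections import defaultdict
from itertools import product


def random_partition(sequences, k, rng):
    """Randomly partition sequences into bins of size k, each bin of a single type.

    Returns (bins, leftovers), where leftovers are the sequences of each type
    that remain after filling as many complete bins as possible.
    """
    by_type = defaultdict(list)
    for x in sequences:
        by_type[tuple(sorted(x))].append(x)  # sorted symbols identify the type
    bins, leftovers = [], []
    for _, members in sorted(by_type.items()):
        rng.shuffle(members)                 # independent shuffle per type class
        full = len(members) - len(members) % k
        bins.extend(members[i:i + k] for i in range(0, full, k))
        leftovers.extend(members[full:])
    return bins, leftovers


# Toy example: all binary sequences of length 4, bins of size k = 2.
all_sequences = list(product([0, 1], repeat=4))
bins, leftovers = random_partition(all_sequences, 2, random.Random(1))
```

In this toy example the type classes have sizes 1, 4, 6, 4 and 1, so there are seven complete bins and two leftover sequences (the all-zero and all-one sequences).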

Denoting the left-hand side of (10) by $D(n)$, we first use 
Lemma 4, then restrict attention to typical sequences to get 

$$E_\pi[D(n)] = E_\pi\Big[\min_{z^n(j,l)} E[d^n(X^n, z^n(J, L \oplus K))]\Big]$$
$$= E_\pi\Big[\min_{z^n(j)} E[d^n(X^n, z^n(J))]\Big]$$
$$\ge E_\pi\Big[\min_{z^n(j)} \sum_{x^n \in T_\epsilon^n(P_X)} p(x^n) \, d^n(x^n, z^n(J(x^n)))\Big].$$

$^2$ More precisely, we focus on partitions in which the number of bins in 
violation is polynomial in $n$. The set of such partitions is nonempty since the 
total number of types is polynomial in $n$ (see [4]). The forthcoming analysis 
is easily adjusted accordingly. 

Note that although $x^n$ is deterministic when inside the summation 
above, the bin $J(x^n)$ that it belongs to is random because 
we are considering a random partition. Summing over bins and 
moving the summation outside, we have 

$$E_\pi[D(n)] \ge E_\pi\Big[\min_{z^n(j)} \sum_{B \in \mathcal{B}} \sum_{x^n \in B} p(x^n) \, d^n(x^n, z^n(J(x^n)))\Big]$$
$$= E_\pi\Big[\sum_{B \in \mathcal{B}} \min_{z^n} \sum_{x^n \in B} p(x^n) \, d^n(x^n, z^n)\Big].$$



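Next, we sum over types as well, and use the fact that all 
sequences of type $P$ have probability 

$$c_P = |\mathcal{X}|^{-n(H(P) + D(P \| P_X))}$$

to get 

$$E_\pi[D(n)] \ge E_\pi\Big[\sum_{P \in \mathcal{P}_\epsilon^n} \sum_{B \in \mathcal{B}_P} \min_{z^n} \sum_{x^n \in B} p(x^n) \, d^n(x^n, z^n)\Big]$$
$$= E_\pi\Big[\sum_{P \in \mathcal{P}_\epsilon^n} \sum_{B \in \mathcal{B}_P} c_P \min_{z^n} \sum_{x^n \in B} d^n(x^n, z^n)\Big].$$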



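Applying the separability of $d^n(x^n, z^n)$ and moving the expectation 
inside, we have 

$$E_\pi[D(n)] \ge E_\pi\Big[\frac{1}{n} \sum_{i=1}^{n} \sum_{P \in \mathcal{P}_\epsilon^n} \sum_{B \in \mathcal{B}_P} c_P \min_z \sum_{x^n \in B} d(x_i, z)\Big]$$
$$= \frac{1}{n} \sum_{i=1}^{n} \sum_{P \in \mathcal{P}_\epsilon^n} \sum_{B \in \mathcal{B}_P} c_P \, E_\pi\Big[\min_z \sum_{x^n \in B} d(x_i, z)\Big]. \qquad (11)$$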



Keep in mind that the elements of B are random codewords 
because the partition is random. 

Now we analyze the expectation in (11). Viewing $\mathcal{B}_P$ as 
a matrix with the constituent sequences forming the columns, 
we denote the $i$th row by the random sequence $(Y_1, \ldots, Y_{|P|})$. 
Furthermore, we let $(Y_1, \ldots, Y_{k(n)})$ denote the $i$th row of 
$B \in \mathcal{B}_P$; this is acceptable because the forthcoming analysis 
is the same for each row of each bin. For ease of exposition, 
we now refer to $k(n)$ as simply $k$ with the dependence on $n$ 
understood. Thus, we have 

$$E_\pi\Big[\min_z \sum_{x^n \in B} d(x_i, z)\Big] = k \cdot E_\pi\Big[\min_z \sum_{x \in \mathcal{X}} Q_{Y^k}(x) \, d(x, z)\Big],$$

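where $Q_{Y^k}$ is the type of $Y^k$. Denoting the event 
$\{Y^k \in T_\epsilon^k(P)\}$ by $A$, we have by the tower property 
of expectation that 

$$E_\pi\Big[\min_z \sum_{x^n \in B} d(x_i, z)\Big] \ge k \cdot P_\pi[A] \cdot E_\pi\Big[\min_z \sum_{x \in \mathcal{X}} Q_{Y^k}(x) \, d(x, z) \,\Big|\, A\Big]. \qquad (12)$$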



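Focusing attention on the conditional expectation in (12), we 
use the definition of typicality and the triangle inequality to 
get 

$$E_\pi\Big[\min_z \sum_{x \in \mathcal{X}} Q_{Y^k}(x) \, d(x, z) \,\Big|\, A\Big] \ge E_\pi\Big[\min_z \sum_{x \in \mathcal{X}} (P_X(x) - 2\epsilon) \, d(x, z) \,\Big|\, A\Big]$$
$$= \min_z \sum_{x \in \mathcal{X}} (P_X(x) - 2\epsilon) \, d(x, z)$$
$$\ge \min_z E[d(X, z)] - \delta_1(\epsilon), \qquad (13)$$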



where $\delta_1(\epsilon) = 2\epsilon \max_z \sum_{x \in \mathcal{X}} d(x, z)$ goes to zero as $\epsilon \to 0$ 
because the distortion measure $d$ is bounded. Now we bound 
$P_\pi[A]$ in (12). We can assume that $k(n) \in o(|\mathcal{X}|^{nH(P)})$ without 
loss of generality because Alice and Bob can simply ignore 
extra secret key. Invoking Lemmas 1-3 to address (8), we have 



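$$\Big\| P_{Y^k} - \prod_{i=1}^{k} P \Big\|_{TV} = \Big\| P_{Y^k} - \prod_{i=1}^{k} Q_{Y^{|P|}} \Big\|_{TV} \qquad (14)$$
$$\le \frac{|\mathcal{X}| \cdot k(n)}{|P|} \qquad (15)$$
$$\le \frac{|\mathcal{X}| \cdot k(n)}{(n+1)^{-|\mathcal{X}|} \, |\mathcal{X}|^{nH(P)}} \qquad (16)$$
$$< \epsilon \qquad (17)$$

for large enough $n$, where (14) follows from Lemma 1, (15) 
follows from Lemma 2, and (16) follows from Lemma 3. 
By the definition of variational distance and the law of large 
numbers, (17) gives 

$$P_\pi[Y^k \in T_\epsilon^k(P)] \ge P_{\prod P}[Y^k \in T_\epsilon^k(P)] - \epsilon \ge 1 - 2\epsilon \qquad (18)$$

for large enough $n$. The notation $P_{\prod P}$ indicates that the 
probability is evaluated with respect to the i.i.d. distribution 
$\prod_{i=1}^{k} P$. Now, substituting (13) and (18) into (12), we have 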



$$E_\pi\Big[\min_z \sum_{x^n \in B} d(x_i, z)\Big] \ge k \cdot \big(\min_z E[d(X, z)] - \delta_2(\epsilon)\big). \qquad (19)$$



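Upon substituting (19) into (11), we conclude the proof by 
noting that 

$$\frac{1}{n} \sum_{i=1}^{n} \sum_{P \in \mathcal{P}_\epsilon^n} \sum_{B \in \mathcal{B}_P} c_P \cdot k(n) = P[X^n \in T_\epsilon^n(P_X)] \ge 1 - \epsilon$$

for large enough $n$. ■ 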



V. Conclusion 

If an eavesdropper is trying to reconstruct an information 
sequence in the Shannon cipher system, we have shown 
that even small amounts of secret key enable the cipher to 
cause maximal distortion in the eavesdropper's estimate. Any 
positive rate of secret key will suffice. However, the rate of 
secret key, implying exponential growth in the number of 
secret key assignments, is not even the right way to discuss the 
theoretical limits. Corollary 1 shows that the proper question to 
address is the tradeoff between secret key size and guaranteed 
distortion, irrespective of the transmission length. 